CSS-LM: A Contrastive Framework for Semi-Supervised Fine-Tuning of Pre-Trained Language Models
Abstract
Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many low-resource scenarios, conventional fine-tuning strategies cannot sufficiently capture the important semantic features for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") to improve the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. The experimental results show that CSS-LM achieves better results than the conventional fine-tuning strategy on a series of downstream tasks with few-shot settings, and outperforms the latest supervised contrastive fine-tuning strategies. Our datasets and source code will be available to provide more details.
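The retrieve-then-contrast idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes instances are already embedded as small toy vectors, collapses the domain-level and class-level relatedness into a single cosine similarity, and uses an InfoNCE-style loss as a stand-in for the contrastive objective. The function names `cosine`, `retrieve`, and `contrastive_loss`, the temperature value, and the example vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(anchor, pool, k):
    """Rank unlabeled vectors by similarity to a labeled anchor.

    Assumption: the k most similar vectors are treated as positives
    and the k least similar as negatives (requires len(pool) >= 2 * k).
    """
    ranked = sorted(pool, key=lambda v: cosine(anchor, v), reverse=True)
    return ranked[:k], ranked[-k:]


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss averaged over the positives.

    Each positive is contrasted against all negatives; the loss shrinks
    as positives move closer to the anchor than the negatives are.
    """
    neg_sum = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    losses = []
    for p in positives:
        pos_term = math.exp(cosine(anchor, p) / temperature)
        losses.append(-math.log(pos_term / (pos_term + neg_sum)))
    return sum(losses) / len(losses)


# Toy data: one labeled anchor and four unlabeled candidates (hypothetical).
anchor = [1.0, 0.0]
pool = [[0.9, 0.1], [0.8, 0.3], [-1.0, 0.2], [0.0, 1.0]]

positives, negatives = retrieve(anchor, pool, k=2)
loss = contrastive_loss(anchor, positives, negatives)
swapped_loss = contrastive_loss(anchor, negatives, positives)
```

In this sketch the loss for the retrieved split is far smaller than the loss obtained when positives and negatives are swapped, which is the signal a fine-tuning step would exploit. In a real setting the vectors would come from the PLM encoder and the loss would be minimized by gradient descent alongside the supervised objective.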
Similar resources
Semi-Supervised Learning of Statistical Models for Natural Language Understanding
Natural language understanding is to specify a computational model that maps sentences to their semantic meaning representation. In this paper, we propose a novel framework to train the statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying e...
Discriminative Models for Semi-Supervised Natural Language Learning
An interesting question surrounding semisupervised learning for NLP is: should we use discriminative models or generative models? Despite the fact that generative models have been frequently employed in a semi-supervised setting since the early days of the statistical revolution in NLP, we advocate the use of discriminative models. The ability of discriminative models to handle complex, high-di...
Effects of First Language on Second Language Writing: A Preliminary Contrastive Rhetoric Study of Farsi and English
To explore the idea, the proposed investigation aimed at finding whether the performances of the population of Iranian students studying English in an EFL context are consistent in L1 and L2 writing tasks and whether there is a cross-linguistic transfer in this respect. In this regard, the subjects were instructed to write four compositions, two in English and two in Farsi, which consisted of an ...
Ensemble Semi-Supervised Framework for Brain MRIs Tissue Segmentation
Brain MR image tissue segmentation is one of the most important parts of clinical diagnostic tools. Pixel classification methods have frequently been used in image segmentation, with both supervised and unsupervised approaches. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain. Moreove...
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2021.3105013